Skip to content

Conversation

buzzingwires
Copy link
Contributor

@buzzingwires buzzingwires commented Sep 12, 2025

Motivation and Context

This change is intended to fix a regression introduced by 64db435 and first noticed when reviewing these error messages.

The aforementioned commit aimed to maintain the zhack label repair behavior of d04b5c9 by default, but performed checks and unnecessarily accessed data associated with the new -u (undetach) instruction when doing so.

This caused the old checksum repair behavior (default or -c) to fail when an uberblock with a nonzero TXG was encountered, when this behavior was only intended for -u.

The issue was not caught by old tests because there are so few transactions made on a testing zpool that some uberblocks are never assigned a TXG.

Description

  • The check for an uberblock's TXG being zero is now only performed for the -u option, which assumes a clean detach.

  • The ashift value is only accessed when needed by the -u option.

  • Determining whether to do byteswap for checksum calculation fails if the magic number doesn't match the expected for either endianness, rather than only if the magic number is zero.

  • Code is refactored to better separate it into discrete tasks and track return values.

  • In a separate commit, typesets are added to local variable declarations in test scripts to enforce their scope.

How Has This Been Tested?

Case one (zhack_label_repair_001.ksh) of the relevant tests has been revised to do the following 128 256 times: touch the test data stored in the pool, then call zpool sync on the pool. This ensures there are enough updates for every uberblock to be used, so that the checksum repair will encounter a nonzero TXG. Documentation for these tests is updated accordingly.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@github-actions github-actions bot added the Status: Work in Progress Not yet ready for general review label Sep 12, 2025
@behlendorf
Copy link
Contributor

Using the zpool sync approach for the test case should be fine. We do use test images elsewhere but mainly for impossible to reproduce situations, like verifying ancient pool versions can be imported.

@buzzingwires buzzingwires marked this pull request as ready for review September 13, 2025 22:10
@github-actions github-actions bot added Status: Code Review Needed Ready for review and testing and removed Status: Work in Progress Not yet ready for general review labels Sep 13, 2025
This commit fixes a likely regression introduced by 64db435 where the
checksum repair functionality (`-c` or default behavior) will perform
checks and access data associated with the newer undetach (`-u`)
functionality, resulting in a failure when an uberblock's TXG is not 0
as required by `-u` but not `-c`

Additionally, code is refactored for better separation of tasks.

Signed-off-by: buzzingwires <[email protected]>
@buzzingwires
Copy link
Contributor Author

Thank you for the review. Question about the shell dialect: Should I generally assume ksh93 will be used to run tests for OpenZFS? This is very much my oversight (and lack of familiarity with ksh), but I've realized typeset should probably be used to give variables local scope. Probably benign at the moment, but not great from a QA standpoint.

@behlendorf
Copy link
Contributor

Yes, for historical reasons the test suite is almost all in ksh93. Using typeset is probably technically correct here, but given how small the test scripts are in general it's rarely been a problem. I don't think it's critical to add the missing typesets, but if you'd like to add them that'd be great.

@buzzingwires
Copy link
Contributor Author

Totally understand. Just wanted to know what I was working with. I certainly will, then. Would you prefer on this PR?

@behlendorf
Copy link
Contributor

Sure, including it in this PR would be fine. Please just keep that change separate as its own commit.

As a quality assurance measure, `typeset` is added to local variable
declarations to actually enforce their intended scope.

Signed-off-by: buzzingwires <[email protected]>
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Sep 15, 2025
behlendorf pushed a commit that referenced this pull request Sep 17, 2025
As a quality assurance measure, `typeset` is added to local variable
declarations to actually enforce their intended scope.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: buzzingwires <[email protected]>
Closes #17732
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Sep 17, 2025
This commit fixes a likely regression introduced by 64db435 where the
checksum repair functionality (`-c` or default behavior) will perform
checks and access data associated with the newer undetach (`-u`)
functionality, resulting in a failure when an uberblock's TXG is not 0
as required by `-u` but not `-c`

Additionally, code is refactored for better separation of tasks.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: buzzingwires <[email protected]>
Closes openzfs#17732
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Sep 17, 2025
As a quality assurance measure, `typeset` is added to local variable
declarations to actually enforce their intended scope.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: buzzingwires <[email protected]>
Closes openzfs#17732
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants